=begin pod :tag<perl6>

=TITLE Lists, Sequences, and Arrays

=SUBTITLE Positional data constructs

Lists have been a central part of computing since before there
were computers, during which time many devils have taken up residence in
their details. They were actually one of the hardest parts of Perl 6
to design, but through persistence and patience, Perl 6 has arrived with
an elegant system for handling them.

=head1 Literal Lists

Literal L<C<List>s|/type/List> are created with commas and semicolons B<not>
with parentheses, so:

    1, 2;        # This is two-element list
    our $list = (1, 2);      # This is also a List, in parentheses
    $list = (1; 2);      # same List (see below)
    $list = (1);         # This is not a List, just a 1 in parentheses
    $list = (1,);        # This is a one-element List

There is one exception, empty lists are created with just parenthesis:

=for code :skip-test
    ();          # This is an empty List
    (,);         # This is a syntax error

Note that hanging commas are just fine as long as the beginning and
end of a list are clear, so feel free to use them for easy code editing.

Parentheses can be used to mark the beginning and end of a C<List>, so:

    (1, 2), (1, 2); # This is a list of two lists.

C<List>s of C<List>s can also be created by combining comma and semicolon.
This is also called multi-dimensional syntax, because it is most often used to
index multidimensional arrays.

    say so (1,2; 3,4) eqv ((1,2), (3,4));
    # OUTPUT: «True␤»
    say so (1,2; 3,4;) eqv ((1,2), (3,4));
    # OUTPUT: «True␤»
    say so ("foo";) eqv ("foo") eqv (("foo")); # not a list
    # OUTPUT: «True␤»

Unlike a comma, a hanging semicolon does not create a multidimensional list
in a literal.  However, be aware that this behavior changes in most argument
lists, where the exact behavior depends on the function... but will usually
be:

    say('foo';);   # a list with one element and the empty list
    # OUTPUT: «(foo)()␤»
    say(('foo';)); # no list, just the string "foo"
    # OUTPUT: «foo␤»

Because the semicolon doubles as a
L<statement terminator|/language/control#statements>
it will end a literal list when used at the top level, instead creating
a statement list.  If you want to create a statement list inside parenthesis,
use a sigil before the parenthesis:

    say so (42) eqv $(my $a = 42; $a;);
    # OUTPUT: «True␤»
    say so (42,42) eqv (my $a = 42; $a;);
    # OUTPUT: «True␤»

Individual elements can be pulled out of a list using a subscript.  The
first element of a list is at index number zero:

    =begin code :skip-test
    say (1, 2)[0];  # says 1
    say (1, 2)[1];  # says 2
    say (1, 2)[2];  # says Nil
    say (1, 2)[-1]; # Error
    say ((<a b>,<c d>),(<e f>,<g h>))[1;0;1]; # says "f"
    =end code

=head1 The @ sigil

Variables in Perl 6 whose names bear the C<@> sigil are expected to
contain some sort of list-like object. Of course, other variables may
also contain these objects, but C<@>-sigiled variables always do, and
are expected to act the part.

By default, when you assign a C<List> to an C<@>-sigiled variable, you create
an C<Array>. Those are described below. If, instead you want to put an actual
C<List> into an C<@>-sigiled variable, you can use binding with C<:=> instead.

    my @a := 1, 2, 3;

One of the ways C<@>-sigiled variables act like lists is by always supporting
L<positional subscripting|/language/subscripts>. Anything bound to a C<@>-sigiled
value must support the L<Positional|/type/Positional> role which guarantees this:

    my @a := 1; # Type check failed in binding; expected Positional but got Int

=head1 Reset a List Container

To remove all elements from a Positional container assign
L<C<Empty>|/type/Slip#Empty>, the empty list C<()> or a C<Slip> of the empty
list to the container.

    my @a = 1, 2, 3;
    @a = ();
    @a = Empty;
    @a = |();

=head1 Iteration

All lists may be iterated, which means taking each element from the
list in order and stopping after the last element:

    for 1, 2, 3 { .say }  # OUTPUT: «1␤2␤3␤»

=head1 Testing for Elements

To test for elements convert the C<List> or C<Array> to a L<C<Set>|/type/Set>
or use a Set L<operator|/language/setbagmix>.

    my @a = <foo bar buzz>;
    say @a.Set<bar buzz>; # OUTPUT: «(True True)␤»
    say so 'bar' ∈ @a;    # OUTPUT: «True␤»

=head2 Sequences

Not all lists are born full of elements.  Some only create as many elements
as they are asked for.  These are called sequences, which are of type L<Seq|/type/Seq>.
As it so happens, loops return C<Seq>s.

    (loop { 42.say })[2]  # OUTPUT: «42␤42␤42␤»

So, it is fine to have infinite lists in Perl 6, just so long as you never
ask them for all their elements.  In some cases, you may want to avoid
asking them how long they are too – Perl 6 will try to return C<Inf> if
it knows a sequence is infinite, but it cannot always know.

=comment TODO link or describe C<...>

Although the C<Seq> class does provide some positional subscripting, it does
not provide the full interface of C<Positional>, so an C<@>-sigiled variable
may B<not> be bound to a C<Seq>.

    my @s := Seq.new(<a b c>); CATCH { default { say .^name, ' ', .Str } }
    # OUTPUT «Type check failed in binding to $iter; expected Iterator but got List ($("a", "b", "c"))␤  in block <unit> at <tmp> line 1␤␤»

This is because the C<Seq> does not keep values around after you have used them.
This is useful behavior if you have a very long sequence, as you may want to
throw values away after using them, so that your program does not fill up memory.
For example, when processing a file of a million lines:

    =begin code :preamble<sub do-something-with($x){ }>
    for 'filename'.IO.lines -> $line {
        do-something-with($line);
    }
    =end code

You can be confident that the entire content of the file will not stay around
in memory, unless you are explicitly storing the lines somewhere.

On the other hand, you may want to keep old values around in some cases.
It is possible to hide a C<Seq> inside a C<List>, which will still be lazy,
but will remember old values. This is done by calling the C<.list> method.
Since this C<List> fully supports C<Positional>, you may bind it directly
to an C<@>-sigiled variable.

    my @s := (loop { 42.say }).list;
    @s[2]; # says 42 three times
    @s[1]; # does not say anything
    @s[4]; # says 42 two more times

You may also use the C<.cache> method instead of C<.list>, depending
on how you want the references handled.  See the L<page on C<Seq>|/type/Seq>
for details.

=comment TODO document .iterator

=head2 Slips

Sometimes you want to insert the elements of a list into another list.
This can be done with a special type of list called a L<Slip|/type/Slip>.

    say (1, (2, 3), 4) eqv (1, 2, 3, 4);         # OUTPUT: «False␤»
    say (1, Slip.new(2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «True␤»
    say (1, slip(2, 3), 4) eqv (1, 2, 3, 4);     # OUTPUT: «True␤»

Another way to make a C<Slip> is with the C<|> prefix operator.  Note that
this has a tighter precedence than the comma, so it only affects a single
value, but unlike the above options, it will break L<Scalars|/type/Scalar>.

    say (1, |(2, 3), 4) eqv (1, 2, 3, 4);        # OUTPUT: «True␤»
    say (1, |$(2, 3), 4) eqv (1, 2, 3, 4);       # OUTPUT: «True␤»
    say (1, slip($(2, 3)), 4) eqv (1, 2, 3, 4);  # OUTPUT: «False␤»

=head1 Lazy Lists
X<|lazy (property of List)>

Lists can be lazy, what means that their values are computed on demand and
stored for later use. To create a lazy list use
L<gather/take|/language/control#gather/take> or the
L<sequence operator|/language/operators#infix_...>. You can also write a class that
implements the role L<Iterable|/type/Iterable> and returns C<True> on a call to
L<is-lazy|/routine/is-lazy>. Please note that some methods like C<elems> cannot be
called on a lazy List and will result in a thrown L<Exception|/type/Exception>.

    # This list is lazy and elements will not be available
    # until explicitly requested.

    my @l = lazy 0..5;
    say @l.is-lazy;     # OUTPUT: «True␤»
    say @l[];           # OUTPUT: «[...]␤»

    # Once all elements have been retrieved, the List
    # is no longer considered lazy.

    my @no-longer-lazy = eager @l;           # Forcing eager evaluation
    say @no-longer-lazy.is-lazy;             # OUTPUT: «False␤»
    say @no-longer-lazy[];                   # OUTPUT: «[0 1 2 3 4 5]␤»

A common use case for lazy Lists are the processing of infinite sequences of numbers,
whose values have not been computed yet and cannot be computed in their entirety.
Specific values in the List will only be computed when they are needed.

    my @l = 1, 2, 4, 8 ... Inf;
    say @l[0..16];
    # OUTPUT: «(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536)␤»

=head1 Immutability

The lists we have talked about so far (C<List>, C<Seq> and C<Slip>)
are all immutable.  This means you cannot remove elements from them,
or re-bind existing elements:

    =begin code
    (1, 2, 3)[0]:delete; # Error Can not remove elements from a List
    (1, 2, 3)[0] := 0;   # Error Cannot use bind operator with this left-hand side
    (1, 2, 3)[0] = 0;    # Error Cannot modify an immutable Int
    =end code

However, if any of the elements is wrapped in a L<C<Scalar>|/type/Scalar> you
can still change the value which that C<Scalar> points to:

    my $a = 2;
    (1, $a, 3)[1] = 42;
    $a.say;            # OUTPUT: «42␤»

...that is, it is only the list structure itself – how many elements there are
and each element's identity – that is immutable.  The immutability is not
contagious past the identity of the element.

=head1 List Contexts

So far we have mostly dealt with lists in neutral contexts.  Lists are actually
very context sensitive on a syntactical level.

=head2 List Assignment Context

When a list appears on the right hand side of an assignment into a C<@>-sigiled
variable, it is "eagerly" evaluated.  This means that a C<Seq> will be iterated
until it can produce no more elements.  This is one of the places you do not want
to put an infinite list, lest your program hang and, eventually, run out of memory:

    my $i = 3;
    my @a = (loop { $i.say; last unless --$i }); # OUTPUT: «3␤2␤1␤»
    say "take off!";

=head2 Flattening "Context"

When you have a list that contains sub-lists, but you only want one flat list,
you may flatten the list to produce a sequence of values as if all parentheses
were removed. This works no matter how many levels deep the parentheses are nested.

    say (1, (2, (3, 4)), 5).flat eqv (1, 2, 3, 4, 5) # OUTPUT: «True␤»

This is not really a syntactical "context" as much as it is a process of
iteration, but it has the appearance of a context.

Note that L<C<Scalar>s|/type/Scalar> around a list will make it immune to
flattening:

    for (1, (2, $(3, 4)), 5).flat { .say } # OUTPUT: «1␤2␤(3 4)␤5␤»

...but an C<@>-sigiled variable will spill its elements.

    my @l := 2, (3, 4);
    for (1, @l, 5).flat { .say };      # OUTPUT: «1␤2␤3␤4␤5␤»
    my @a = 2, (3, 4);                 # Arrays are special, see below
    for (1, @a, 5).flat { .say };      # OUTPUT: «1␤2␤(3 4)␤5␤»

=head2 Argument List (Capture) Context

When a list appears as arguments to a function or method call, special
syntax rules are at play: the list is immediately converted into a
C<Capture>.  A C<Capture> itself has a List (C<.list>) and a Hash (C<.hash>).
Any C<Pair> literals whose keys are not quoted, or which are not parenthesized,
never make it into C<.list>.  Instead, they are considered to be named
arguments and squashed into C<.hash>.  See the L<page on C<Capture>|/type/Capture>
for the details of this processing.

Consider the following ways to make a new C<Array> from a C<List>.  These ways
place the C<List> in an argument list context and because of that, the C<Array>
only contains C<1> and C<2> but not the C<Pair> C<:c(3)>, which is ignored.

    Array.new(1, 2, :c(3));
    Array.new: 1, 2, :c(3);
    new Array: 1, 2, :c(3);

In contrast, these ways do not place the C<List> in argument list context,
so all the elements, even the C<Pair> C<:c(3)>, are placed in the C<Array>.

    Array.new((1, 2, :c(3)));
    (1, 2, :c(3)).Array;
    my @a = 1, 2, :c(3); Array.new(@a);
    my @a = 1, 2, :c(3); Array.new: @a;
    my @a = 1, 2, :c(3); new Array: @a;

In argument list context the C<|> prefix operator applied to a C<Positional>
will always slip list elements as positional arguments to the Capture,
while a C<|> prefix operator applied to an C<Associative> will slip pairs in
as named parameters:

    my @a := 2, "c" => 3;
    Array.new(1, |@a, 4);    # Array contains 1, 2, :c(3), 4
    my %a = "c" => 3;
    Array.new(1, |%a, 4);    # Array contains 1, 4

=head2 Slice Indexing Context

From the perspective of the C<List> inside a L<slice subscript|/language/subscripts#Slices>,
is only remarkable in that it is unremarkable: because
L<adverbs|/language/subscripts#Adverbs> to a  slice are attached after the C<]>,
the inside of a slice is B<not> an argument list, and no special processing
of pair forms happens.

Most C<Positional> types will enforce an integer coercion on each element
of a slice index, so pairs appearing there will generate an error, anyway:

    =begin code
    (1, 2, 3)[1, 2, :c(3)] # OUTPUT: «Method 'Int' not found for invocant of class 'Pair'␤»
    =end code

...however this is entirely up to the type – if it defines an order
for pairs, it could consider C<:c(3)> a valid index.

Indices inside a slice are usually not automatically flattened, but
neither are sublists usually coerced to C<Int>.  Instead, the list structure
is kept intact, causing a nested slice operation that replicates the
structure in the result:

    say ("a", "b", "c")[(1, 2), (0, 1)] eqv (("b", "c"), ("a", "b")) # OUTPUT: «True␤»

=head2 Range as Slice

A L<C<Range>|/type/Range> is a container for a lower and a upper boundary.
Generating a slice with a C<Range> will include any index between those bounds,
including the bounds. For infinite upper boundaries we agree with
mathematicians that C<Inf> equals C<Inf-1>.

    my @a = 1..5;
    say @a[0..2];     # OUTPUT: «(1 2 3)␤»
    say @a[0..^2];    # OUTPUT: «(1 2)␤»
    say @a[0..*];     # OUTPUT: «(1 2 3 4 5)␤»
    say @a[0..^*];    # OUTPUT: «(1 2 3 4 5)␤»
    say @a[0..Inf-1]; # OUTPUT: «(1 2 3 4 5)␤»

=head2 Array Constructor Context

Inside an Array Literal, the list of initialization values is not in capture
context and is just a normal list.  It is, however, eagerly evaluated just as
in assignment.

    say so [ 1, 2, :c(3) ] eqv Array.new((1, 2, :c(3))); # OUTPUT: «True␤»
    [while $++ < 2 { 42.say; 43 }].map: *.say;           # OUTPUT: «42␤42␤43␤43␤»
    (while $++ < 2 { 42.say; 43 }).map: *.say;           # OUTPUT: «42␤43␤42␤43␤»

Which brings us to Arrays...

=head1 Arrays

Arrays differ from lists in three major ways: Their elements may be typed,
they automatically itemize their elements, and they are mutable.  Otherwise
they are Lists and are accepted wherever lists are.

    say Array ~~ List     # OUTPUT: «True␤»

A fourth, more subtle, way they differ is that when working with Arrays, it
can sometimes be harder to maintain laziness or work with infinite sequences.

=head2 Typing

X<|typed array>X<|[ ] (typed array)>
Arrays may be typed such that their slots perform a typecheck whenever
they are assigned to.  An Array that only allows C<Int> values to be assigned
is of type C<Array[Int]> and one can create one with C<Array[Int].new>.  If
you intend to use an C<@>-sigiled variable only for this purpose, you may
change its type by specifying the type of the elements when declaring it:

    =begin code
    my Int @a = 1, 2, 3;              # An Array that contains only Ints
    my @b := Array[Int].new(1, 2, 3); # Same thing, but the variable is not typed
    say @b eqv @a;                    # says True.
    my @c = 1, 2, 3;                  # An Array that can contain anything
    say @b eqv @c;                    # says False because types do not match
    say @c eqv (1, 2, 3);             # says False because one is a List
    say @b eq @c;                     # says True, because eq only checks values
    say @b eq (1, 2, 3);              # says True, because eq only checks values

    @a[0] = 42;                       # fine
    @a[0] = "foo";                    # error: Type check failed in assignment
    =end code

In the above example we bound a typed Array object to a C<@>-sigil variable for
which no type had been specified.  The other way around does not work – you may
not bind an Array that has the wrong type to a typed C<@>-sigiled variable:

    =begin code
    my @a := Array[Int].new(1, 2, 3);     # fine
    @a := Array[Str].new("a", "b");       # fine, can be re-bound
    my Int @b := Array[Int].new(1, 2, 3); # fine
    @b := Array.new(1, 2, 3);             # error: Type check failed in binding
    =end code

When working with typed arrays, it is important to remember that they are
nominally typed. This means the declared type of an array is what matters.
Given the following sub declaration:

    sub mean(Int @a) {
        @a.sum / @a.elems
    }

Calls that pass an Array[Int] will be successful:

    =begin code :skip-test
    my Int @b = 1, 3, 5;
    say mean(@b);                       # @b is Array[Int]
    say mean(Array[Int].new(1, 3, 5));  # Anonymous Array[Int]
    say mean(my Int @ = 1, 3, 5);       # Another anonymous Array[Int]
    =end code

However, the following calls will all fail, due to passing an untyped array,
even if the array just happens to contain Int values at the point it is
passed:

    =begin code :skip-test
    my @c = 1, 3, 5;
    say mean(@c);                       # Fails, passing untyped Array
    say mean([1, 3, 5]);                # Same
    say mean(Array.new(1, 3, 5));       # Same again
    =end code

Note that in any given compiler, there may be fancy, under-the-hood, ways to
bypass the type check on arrays, so when handling untrusted input, it can be
good practice to perform additional type checks, where it matters:

    =begin code :skip-test
    for @a -> Int $i { $_++.say };
    =end code

However, as long as you stick to normal assignment operations inside a trusted
area of code, this will not be a problem, and typecheck errors will happen
promptly during assignment to the array, if they cannot be caught at compile
time.  None of the core functions provided in Perl 6 for operating on lists
should ever produce a wonky typed Array.

Nonexistent elements (when indexed), or elements to which C<Nil> has been assigned,
will assume a default value.  This default may be adjusted on a variable-by-variable
basis with the C<is default> trait.  Note that an untyped C<@>-sigiled variable has
an element type of C<Mu>, however its default value is an undefined C<Any>:

    my @a;
    @a.of.perl.say;                 # OUTPUT: «Mu␤»
    @a.default.perl.say;            # OUTPUT: «Any␤»
    @a[0].say;                      # OUTPUT: «(Any)␤»
    my Numeric @n is default(Real);
    @n.of.perl.say;                 # OUTPUT: «Numeric␤»
    @n.default.perl.say;            # OUTPUT: «Real␤»
    @n[0].say;                      # OUTPUT: «(Real)␤»

=head2 Fixed Size Arrays

To limit the dimensions of an C<Array> provide the dimensions separated by C<,>
or C<;> in brackets after the name of the array container. The values of such
an C<Arrays> will default to C<Any>. The shape can be accessed at runtime via
the C<shape> method.

    my @a[2,2];
    say @a.perl;
    # OUTPUT: «Array.new(:shape(2, 2), [Any, Any], [Any, Any])␤»
    say @a.shape;
    # OUTPUT: «(2 2)␤»

Assignment to a fixed size Array will promote a List of Lists to an Array of
Arrays.

    my @a[2;2] = (1,2; 3,4);
    @a[1;1] = 42;
    say @a.perl;
    # OUTPUT: «Array.new(:shape(2, 2), [1, 2], [3, 42])␤»

=head2 Itemization

For most uses, Arrays consist of a number of slots each containing a C<Scalar> of
the correct type.  Each such C<Scalar>, in turn, contains a value of that type.
Perl 6 will automatically type-check values and create Scalars to contain them
when Arrays are initialized, assigned to, or constructed.

This is actually one of the trickiest parts of Perl 6 list handling to get a
firm understanding of.

First, be aware that because itemization in Arrays is assumed, it essentially
means that C<$(…)>s are being put around everything that you assign to an
array, if you do not put them there yourself.  On the other side, Array.perl
does not put C<$> to explicitly show scalars, unlike List.perl:

    ((1, 2), $(3, 4)).perl.say; # says "((1, 2), $(3, 4))"
    [(1, 2), $(3, 4)].perl.say; # says "[(1, 2), (3, 4)]"
                                # ...but actually means: "[$(1, 2), $(3, 4)]"

It was decided all those extra dollar signs and parentheses were more of an
eye sore than a benefit to the user.  Basically, when you see a square bracket,
remember the invisible dollar signs.

Second, remember that these invisible dollar signs also protect against
flattening, so you cannot really flatten the elements inside of an Array
with a normal call to C<flat> or C<.flat>.

    ((1, 2), $(3, 4)).flat.perl.say; # OUTPUT: «(1, 2, $(3, 4)).Seq␤»
    [(1, 2), $(3, 4)].flat.perl.say; # OUTPUT: «($(1, 2), $(3, 4)).Seq␤»

Since the square brackets do not themselves protect against flattening,
you can still spill the elements out of an Array into a surrounding list
using C<flat>.

    (0, [(1, 2), $(3, 4)], 5).flat.perl.say; # OUTPUT: «(0, $(1, 2), $(3, 4), 5).Seq␤»

...the elements themselves, however, stay in one piece.

This can irk users of data you provide if you have deeply nested Arrays
where they want flat data.  Currently they have to deeply map the structure
by hand to undo the nesting:

    say gather [0, [(1, 2), [3, 4]], $(5, 6)].deepmap: *.take; # OUTPUT: «(1 2 3 4 5 6)␤»

...future versions of Perl 6 might find a way to make this easier.  However,
not returning Arrays or itemized lists from functions, when non-itemized lists
are sufficient, is something that one should consider as a courtesy to their
users:

=item use Slips when you want to always merge with surrounding lists
=item use non-itemized lists when you want to make it easy for the user to flatten
=item use itemized lists to protect things the user probably will not want flattened
=item use Arrays as non-itemized lists of itemized lists, if appropriate,
=item use Arrays if the user is going to want to mutate the result without copying it first.

The fact that all elements of an array are itemized (in C<Scalar> containers)
is more a gentleman's agreement than a universally enforced rule, and it is
less well enforced that typechecks in typed arrays.  See the section below
on binding to Array slots.

=head2 Literal Arrays

Literal Arrays are constructed with a List inside square brackets.  The List is
eagerly iterated (at compile time if possible) and values in the list are each
type-checked and itemized.  The square brackets themselves will spill elements
into surrounding lists when flattened, but the elements themselves will not
spill due to the itemization.

=head2 Mutability

Unlike lists, Arrays are mutable.  Elements may deleted, added, or changed.

    my @a = "a", "b", "c";
    @a.say;                  # OUTPUT: «[a b c]␤»
    @a.pop.say;              # OUTPUT: «c␤»
    @a.say;                  # OUTPUT: «[a b]␤»
    @a.push("d");
    @a.say;                  # OUTPUT: «[a b d]␤»
    @a[1, 3] = "c", "c";
    @a.say;                  # OUTPUT: «[a c d c]␤»


=head3 Assigning

Assignment of a list to an Array is eager.  The list will be entirely evaluated,
and should not be infinite or the program may hang.  Assignment to a slice of
an C<Array> is, likewise, eager, but only up to the requested number of elements,
which may be finite:

    my @a;
    @a[0, 1, 2] = (loop { 42 });
    @a.say;                     # OUTPUT: «[42 42 42]␤»

During assignment, each value will be typechecked to ensure it is a permitted
type for the C<Array>.  Any C<Scalar> will be stripped from each value and a
new C<Scalar> will be wrapped around it.

=head3 Binding

Individual Array slots may be bound the same way C<$>-sigiled variables are:

    my $b = "foo";
    my @a = 1, 2, 3;
    @a[2] := $b;
    @a.say;          # OUTPUT: «[1 2 "foo"]␤»
    $b = "bar";
    @a.say;          # OUTPUT: «[1 2 "bar"]␤»

...but binding Array slots directly to values is strongly discouraged.  If you do,
expect surprises with built-in functions.  The only time this would be done is if
a mutable container that knows the difference between values and Scalar-wrapped
values is needed, or for very large Arrays where a native-typed array cannot be used.
Such a creature should never be passed back to unsuspecting users.

=end pod

# vim: expandtab softtabstop=4 shiftwidth=4 ft=perl6
